#################################################################################################################################
#### When to Protect? Using the Crosswise Model to Integrate Protected and Direct Responses in Surveys of Sensitive Behavior ####
#################################################################################################################################

########################################
######  REPLICATION INSTRUCTIONS  ######
########################################

########################################
######  MONTE CARLO SIMULATIONS  #######
########################################


**Before running the R scripts, install the R package "mc2d" ("Tools for Two-Dimensional Monte-Carlo Simulations")**
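
The installation step can be sketched as follows, assuming an internet connection and a configured CRAN mirror. The helper name `missing_packages` is introduced here for illustration only and is not part of the replication files; `requireNamespace()` and `install.packages()` are standard R functions.

```r
# Return the subset of `pkgs` that cannot be loaded from the current R library.
# (`missing_packages` is a hypothetical helper, not part of the replication files.)
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- missing_packages("mc2d")

# Install only when something is missing and R is running interactively
# (for example, in the console where "Run All" will be used).
if (length(needed) > 0 && interactive()) {
  install.packages(needed)
}
```

If `needed` is empty, "mc2d" is already available and nothing is installed.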


### REPLICATION STEPS FOR RESULTS IN TABLE 3 ###

I. HIGH EVASIVENESS SCENARIO - DQ - N=2500

1. Open the R script "DQ MONTE CARLOS HE 2500.R".

2. Select "Run All."

3. Wait for the simulation to stop running.

4. Type "BETAS" into the R console and press the ENTER key. The average values of the estimates of beta[0], beta[1], and beta[2] based on using direct
questioning (DQ) under the high evasiveness scenario with n=2500 will be displayed.

5. Type "MSE" into the R console and press the ENTER key. The mean squared error of the estimates of beta[0], beta[1], and beta[2] based on using direct
questioning (DQ) under the high evasiveness scenario with n=2500 will be displayed.

II. HIGH EVASIVENESS SCENARIO - DQ - N=5000
  
1. Open the R script "DQ MONTE CARLOS HE 5000.R".

2. Select "Run All."

3. Wait for the simulation to stop running.

4. Type "BETAS" into the R console and press the ENTER key. The average values of the estimates of beta[0], beta[1], and beta[2] based on using direct
questioning (DQ) under the high evasiveness scenario with n=5000 will be displayed.

5. Type "MSE" into the R console and press the ENTER key. The mean squared error of the estimates of beta[0], beta[1], and beta[2] based on using direct
questioning (DQ) under the high evasiveness scenario with n=5000 will be displayed.

III. SST - N=2500 [RESULTS SAME UNDER HIGH AND LOW EVASIVENESS SCENARIOS]

1. Open the R script "SST MONTE CARLOS 2500.R".

2. Select "Run All."

3. Wait for the simulation to stop running.

4. Type "BETAS" into the R console and press the ENTER key. The average values of the estimates of beta[0], beta[1], and beta[2] based on using sensitive
survey technique questioning (SST) with n=2500 will be displayed.

5. Type "MSE" into the R console and press the ENTER key. The mean squared error of the estimates of beta[0], beta[1], and beta[2] based on using sensitive
survey technique questioning (SST) with n=2500 will be displayed.

IV. SST - N=5000 [RESULTS SAME UNDER HIGH AND LOW EVASIVENESS SCENARIOS]

1. Open the R script "SST MONTE CARLOS 5000.R".

2. Select "Run All."

3. Wait for the simulation to stop running.

4. Type "BETAS" into the R console and press the ENTER key. The average values of the estimates of beta[0], beta[1], and beta[2] based on using sensitive
survey technique questioning (SST) with n=5000 will be displayed.

5. Type "MSE" into the R console and press the ENTER key. The mean squared error of the estimates of beta[0], beta[1], and beta[2] based on using sensitive
survey technique questioning (SST) with n=5000 will be displayed.

V. HIGH EVASIVENESS SCENARIO - Joint - N=2500

1. Open the R script "JOINT RESPONSE MONTE CARLOS HE 2500.R". 

2. Select "Run All."

3. Wait for the simulation to stop running.

4. Type "PARAMETERS" into the R console and press the ENTER key. The average values of the estimates of lambdaT[1], lambdaL[1], lambdaT[0], beta[0], beta[1], and
beta[2] based on using joint response questioning (joint) under the high evasiveness scenario with n=2500 will be displayed.

5. Type "MSE" into the R console and press the ENTER key. The mean squared error of the estimates of lambdaT[1], lambdaL[1], lambdaT[0], beta[0], beta[1], and 
beta[2] based on using joint response questioning (joint) under the high evasiveness scenario with n=2500 will be displayed.

VI. HIGH EVASIVENESS SCENARIO - Joint - N=5000

1. Open the R script "JOINT RESPONSE MONTE CARLOS HE 5000.R". 

2. Select "Run All."

3. Wait for the simulation to stop running.

4. Type "PARAMETERS" into the R console and press the ENTER key. The average values of the estimates of lambdaT[1], lambdaL[1], lambdaT[0], beta[0], beta[1], and
beta[2] based on using joint response questioning (joint) under the high evasiveness scenario with n=5000 will be displayed.

5. Type "MSE" into the R console and press the ENTER key. The mean squared error of the estimates of lambdaT[1], lambdaL[1], lambdaT[0], beta[0], beta[1], and 
beta[2] based on using joint response questioning (joint) under the high evasiveness scenario with n=5000 will be displayed.

VII. LOW EVASIVENESS SCENARIO - DQ - N=2500

1. Open the R script "DQ MONTE CARLOS LE 2500.R".

2. Select "Run All."

3. Wait for the simulation to stop running.

4. Type "BETAS" into the R console and press the ENTER key. The average values of the estimates of beta[0], beta[1], and beta[2] based on using direct
questioning (DQ) under the low evasiveness scenario with n=2500 will be displayed.

5. Type "MSE" into the R console and press the ENTER key. The mean squared error of the estimates of beta[0], beta[1], and beta[2] based on using direct
questioning (DQ) under the low evasiveness scenario with n=2500 will be displayed.

VIII. LOW EVASIVENESS SCENARIO - DQ - N=5000
  
1. Open the R script "DQ MONTE CARLOS LE 5000.R".

2. Select "Run All."

3. Wait for the simulation to stop running.

4. Type "BETAS" into the R console and press the ENTER key. The average values of the estimates of beta[0], beta[1], and beta[2] based on using direct
questioning (DQ) under the low evasiveness scenario with n=5000 will be displayed.

5. Type "MSE" into the R console and press the ENTER key. The mean squared error of the estimates of beta[0], beta[1], and beta[2] based on using direct
questioning (DQ) under the low evasiveness scenario with n=5000 will be displayed.

IX. LOW EVASIVENESS SCENARIO - Joint - N=2500

1. Open the R script "JOINT RESPONSE MONTE CARLOS LE 2500.R". 

2. Select "Run All."

3. Wait for the simulation to stop running.

4. Type "PARAMETERS" into the R console and press the ENTER key. The average values of the estimates of lambdaT[1], lambdaL[1], lambdaT[0], beta[0], beta[1], and beta[2]
based on using joint response questioning (joint) under the low evasiveness scenario with n=2500 will be displayed.

5. Type "MSE" into the R console and press the ENTER key. The mean squared error of the estimates of lambdaT[1], lambdaL[1], lambdaT[0], beta[0], beta[1], and beta[2]
based on using joint response questioning (joint) under the low evasiveness scenario with n=2500 will be displayed.

X. LOW EVASIVENESS SCENARIO - Joint - N=5000

1. Open the R script "JOINT RESPONSE MONTE CARLOS LE 5000.R". 

2. Select "Run All."

3. Wait for the simulation to stop running.

4. Type "PARAMETERS" into the R console and press the ENTER key. The average values of the estimates of lambdaT[1], lambdaL[1], lambdaT[0], beta[0], beta[1], and beta[2]
based on using joint response questioning (joint) under the low evasiveness scenario with n=5000 will be displayed.

5. Type "MSE" into the R console and press the ENTER key. The mean squared error of the estimates of lambdaT[1], lambdaL[1], lambdaT[0], beta[0], beta[1], and beta[2]
based on using joint response questioning (joint) under the low evasiveness scenario with n=5000 will be displayed.
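
The steps above describe "BETAS" (or "PARAMETERS") as the average of the estimates and "MSE" as their mean squared error. The sketch below illustrates how such summaries are computed in general. The function `summarise_simulation`, the toy matrix `estimates`, and the vector `truth` are all hypothetical; the replication scripts may organise their results differently.

```r
# Summarise Monte Carlo output: average estimate and mean squared error
# for each parameter. `estimates` has one row per simulated data set and
# one column per parameter; `truth` holds the values used to simulate.
# All names and numbers here are hypothetical.
summarise_simulation <- function(estimates, truth) {
  stopifnot(ncol(estimates) == length(truth))
  errors <- sweep(estimates, 2, truth)   # estimate minus true value, column by column
  list(
    average = colMeans(estimates),       # the kind of quantity "BETAS" displays
    mse     = colMeans(errors^2)         # the kind of quantity "MSE" displays
  )
}

# Toy input: three replications of (beta[0], beta[1], beta[2]).
estimates <- rbind(c(0.9, 0.5, -0.2),
                   c(1.1, 0.3, -0.4),
                   c(1.0, 0.4, -0.3))
colnames(estimates) <- c("beta0", "beta1", "beta2")
truth <- c(1.0, 0.4, -0.3)

summarise_simulation(estimates, truth)
```

An average close to the true value indicates little bias, while the MSE combines bias and variance in a single number.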

###### REPLICATION OF FIGURE 2 #######

1. Open the R script "Figure 2.R". 

2. Select "Run All."

NOTE: The script uses the data generated by the simulations "DQ MONTE CARLOS HE 5000.R", "SST MONTE CARLOS 5000.R", and "JOINT RESPONSE MONTE CARLOS HE 5000.R".
These simulations must therefore all be run to completion before attempting to create the figure.
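
As an alternative to opening each file and selecting "Run All," the prerequisite simulations and the figure script can be run in sequence with `source()`. This is a sketch only: `run_in_order` is a hypothetical helper, it assumes the scripts sit in the current working directory, and it assumes the figure script can pick up the simulation results from the same R session. The SST file name below follows step IV of the Table 3 instructions; adjust it to match the file in your copy.

```r
# Run a set of R scripts in order, stopping early if any file is missing.
# `run_in_order` is a hypothetical helper, not part of the replication files.
run_in_order <- function(files, envir = globalenv()) {
  not_found <- files[!file.exists(files)]
  if (length(not_found) > 0) {
    stop("Script(s) not found: ", paste(not_found, collapse = ", "))
  }
  for (f in files) {
    message("Running ", f)
    source(f, local = envir)   # objects are created in `envir`
  }
  invisible(files)
}

# Intended use (uncomment once the working directory holds the scripts):
# run_in_order(c("DQ MONTE CARLOS HE 5000.R",
#                "SST MONTE CARLOS 5000.R",
#                "JOINT RESPONSE MONTE CARLOS HE 5000.R",
#                "Figure 2.R"))
```

Checking for missing files first avoids discovering a misnamed script only after the earlier simulations have finished.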

 
########################################
######  COSTA RICA DATA ANALYSIS  ######
########################################


**Before running the R scripts, install the R package "mc2d" ("Tools for Two-Dimensional Monte-Carlo Simulations")**


### REPLICATION STEPS FOR RESULTS IN TABLE 4 ###

I. SQ1

1. Open the R script "Table 4 SQ1.R". 

2. Select "Run All."

3. Wait for the program to stop running.

4. The output labeled "$Joint.Model.summary" shows the prevalence estimate ("mu") and diagnostic parameters produced by the joint response model (along with confidence
intervals). The output labeled "$DQ.Model.summary" shows the prevalence estimate (and its confidence interval) for direct questioning only. The output labeled
"$SST.Model.summary" shows the prevalence estimate (and its confidence interval) for the sensitive survey technique only.

II. SQ2

1. Open the R script "Table 4 SQ2.R". 

2. Select "Run All."

3. Wait for the program to stop running.

4. The output labeled "$Joint.Model.summary" shows the prevalence estimate ("mu") and diagnostic parameters produced by the joint response model (along with confidence
intervals). The output labeled "$DQ.Model.summary" shows the prevalence estimate (and its confidence interval) for direct questioning only. The output labeled
"$SST.Model.summary" shows the prevalence estimate (and its confidence interval) for the sensitive survey technique only.
  
III. SQ3

1. Open the R script "Table 4 SQ3.R". 

2. Select "Run All."

3. Wait for the program to stop running.

4. The output labeled "$Joint.Model.summary" shows the prevalence estimate ("mu") and diagnostic parameters produced by the joint response model (along with confidence
intervals). The output labeled "$DQ.Model.summary" shows the prevalence estimate (and its confidence interval) for direct questioning only. The output labeled
"$SST.Model.summary" shows the prevalence estimate (and its confidence interval) for the sensitive survey technique only.


IV. SQ4

1. Open the R script "Table 4 SQ4.R". 

2. Select "Run All."

3. Wait for the program to stop running.

4. The output labeled "$Joint.Model.summary" shows the prevalence estimate ("mu") and diagnostic parameters produced by the joint response model (along with confidence
intervals). The output labeled "$DQ.Model.summary" shows the prevalence estimate (and its confidence interval) for direct questioning only. The output labeled
"$SST.Model.summary" shows the prevalence estimate (and its confidence interval) for the sensitive survey technique only.

V. SQ5

1. Open the R script "Table 4 SQ5.R". 

2. Select "Run All."

3. Wait for the program to stop running.

4. The output labeled "$Joint.Model.summary" shows the prevalence estimate ("mu") and diagnostic parameters produced by the joint response model (along with confidence
intervals). The output labeled "$DQ.Model.summary" shows the prevalence estimate (and its confidence interval) for direct questioning only. The output labeled
"$SST.Model.summary" shows the prevalence estimate (and its confidence interval) for the sensitive survey technique only.

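Because the Table 4 results are printed to the console, it can be convenient to save the printed output of each script to a text file. The sketch below shows one way to do so with `capture.output()` and `source()`. The helper `run_and_log` and the output file names are hypothetical, and the sketch assumes the scripts are in the working directory. Messages and warnings are not captured, only printed output.

```r
# Run one script and save everything it prints to a text file.
# `run_and_log` is a hypothetical helper, not part of the replication files.
run_and_log <- function(script, log_file, envir = globalenv()) {
  printed <- capture.output(
    source(script, local = envir, print.eval = TRUE)  # print visible results
  )
  writeLines(printed, log_file)
  invisible(printed)
}

# Intended use for the five Table 4 scripts (uncomment to run):
# for (q in 1:5) {
#   run_and_log(sprintf("Table 4 SQ%d.R", q),
#               sprintf("Table 4 SQ%d output.txt", q))
# }
```

The saved files can then be searched for the "$Joint.Model.summary", "$DQ.Model.summary", and "$SST.Model.summary" labels.
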
###### REPLICATION OF FIGURE 3 #######

I. Calculating the Estimates in the Figure

A. Costa Ricans, SQ1 (willing to pay a bribe to a police officer)

1. Open the R script "Table 3 Estimates Costa Ricans SQ1.R".

2. Select "Run All."

3. Wait for the program to stop running.

4. The output labeled "$Joint.Model.summary" shows the prevalence estimate ("mu") and diagnostic parameters produced by the joint response model (along with confidence
intervals). The output labeled "$DQ.Model.summary" shows the prevalence estimate (and its confidence interval) for direct questioning only. The output labeled
"$SST.Model.summary" shows the prevalence estimate (and its confidence interval) for the sensitive survey technique only.

B. Non-Costa Ricans, SQ1 (willing to pay a bribe to a police officer)

1. Open the R script "Table 3 Estimates Non Costa Ricans SQ1.R".

2. Select "Run All."

3. Wait for the program to stop running.

4. The output labeled "$Joint.Model.summary" shows the prevalence estimate ("mu") and diagnostic parameters produced by the joint response model (along with confidence
intervals). The output labeled "$DQ.Model.summary" shows the prevalence estimate (and its confidence interval) for direct questioning only. The output labeled
"$SST.Model.summary" shows the prevalence estimate (and its confidence interval) for the sensitive survey technique only.

C. Costa Ricans, SQ2 (has paid a bribe to a police officer)

1. Open the R script "Table 3 Estimates Costa Ricans SQ2.R".

2. Select "Run All."

3. Wait for the program to stop running.

4. The output labeled "$Joint.Model.summary" shows the prevalence estimate ("mu") and diagnostic parameters produced by the joint response model (along with confidence
intervals). The output labeled "$DQ.Model.summary" shows the prevalence estimate (and its confidence interval) for direct questioning only. The output labeled
"$SST.Model.summary" shows the prevalence estimate (and its confidence interval) for the sensitive survey technique only.

D. Non-Costa Ricans, SQ2 (has paid a bribe to a police officer)

1. Open the R script "Table 3 Estimates Non Costa Ricans SQ2.R".

2. Select "Run All."

3. Wait for the program to stop running.

4. The output labeled "$Joint.Model.summary" shows the prevalence estimate ("mu") and diagnostic parameters produced by the joint response model (along with confidence
intervals). The output labeled "$DQ.Model.summary" shows the prevalence estimate (and its confidence interval) for direct questioning only. The output labeled
"$SST.Model.summary" shows the prevalence estimate (and its confidence interval) for the sensitive survey technique only.


II. Drawing the Figure

1. Open the R script "Figure 3.R". 

2. Select "Run All."

### REPLICATION STEPS FOR RESULTS IN TABLE 5 ###

I. Results for Table 5, SQ1 (Willing to pay a bribe)

1. Open the R script "Table 5 SQ1.R".

2. Select "Run All." NOTE: Uncertainty estimates (standard errors, confidence intervals) are calculated via the parametric bootstrap. Because estimating the
explanatory joint response model is computationally demanding, a run with the 600 bootstrap simulations used here can take about 40 minutes.

3. Scroll to the output labeled "$parameter.summary." This output presents the point estimates, standard errors, and 95% confidence intervals for the model
parameters listed under the "model 1" heading in Table 2. These figures are presented in the same order as they are in Table 5.

4. Scroll down to the output labeled $N.obs, $avg.pred.diff, $avg.pred.diff.SE, and $avg.pred.diff.CI. These display the number of observations, the average predicted
difference (APD) point estimate, the APD standard error, and the APD 95% confidence interval, respectively.

NOTE: Because the uncertainty estimates are based on 600 randomly generated bootstrap samples, the standard errors and confidence intervals fluctuate slightly each
time the program is run. The standard error estimates should agree to the second decimal place from run to run; the confidence interval estimates can vary slightly
more.

II. Results for Table 5, SQ2 (Paid a bribe)

1. Open the R script "Table 5 SQ2.R".

2. Select "Run All." NOTE: Uncertainty estimates (standard errors, confidence intervals) are calculated via the parametric bootstrap. Because estimating the
explanatory joint response model is computationally demanding, a run with the 600 bootstrap simulations used here can take about 40 minutes.

3. Scroll to the output labeled "$parameter.summary." This output presents the point estimates, standard errors, and 95% confidence intervals for the model
parameters listed under the "model 1" heading in Table 2. These figures are presented in the same order as they are in Table 5.

4. Scroll down to the output labeled $N.obs, $avg.pred.diff, $avg.pred.diff.SE, and $avg.pred.diff.CI. These display the number of observations, the average predicted
difference (APD) point estimate, the APD standard error, and the APD 95% confidence interval, respectively.

NOTE: Because the uncertainty estimates are based on 600 randomly generated bootstrap samples, the standard errors and confidence intervals fluctuate slightly each
time the program is run. The standard error estimates should agree to the second decimal place from run to run; the confidence interval estimates can vary slightly
more.
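
The run-to-run variation described in the notes above is a general property of bootstrap estimates. The sketch below illustrates it with a parametric bootstrap for a simple proportion. It is a toy stand-in and NOT the joint response model estimated by the Table 5 scripts; `boot_se`, `p_hat`, and `n` are hypothetical. These instructions do not say whether the scripts fix a random seed, so the use of `set.seed()` is shown only as one option for readers who want identical numbers across runs.

```r
# Parametric bootstrap standard error for a simple proportion.
# This is a toy stand-in, NOT the joint response model from the scripts.
boot_se <- function(p_hat, n, B = 600) {
  # Simulate B data sets from the fitted model and re-estimate p on each.
  p_star <- rbinom(B, size = n, prob = p_hat) / n
  sd(p_star)
}

p_hat <- 0.30   # hypothetical point estimate
n     <- 1000   # hypothetical sample size

# Two unseeded runs generally differ in the later decimal places.
se_run1 <- boot_se(p_hat, n)
se_run2 <- boot_se(p_hat, n)

# Fixing the seed before each run reproduces the same value.
set.seed(2024); se_a <- boot_se(p_hat, n)
set.seed(2024); se_b <- boot_se(p_hat, n)

c(run1 = se_run1, run2 = se_run2, seeded_a = se_a, seeded_b = se_b)
```

With 600 bootstrap samples the two unseeded values typically agree to two decimal places, which matches the behaviour described in the notes.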





